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I. AN INTRODUCTION 
TO 
THE PARALLEL DATABASE COMPUTER ARCHITECTURE 

A database management system (DBMS) must provide fast, 
accurate and efficient information processing. A today’s DBMS 
is only adequate for current information processing require- 
ments, but not adequate for new applications, such as multi- 
media data being utilized in the insurance industry where the 
multimedia database is several orders of magnitude bigger than 
the largest databases found today. For new applications data- 
bases larger than a terabyte (10 bytes) will not be unusual. 
The current DBMS architecture cannot be scaled to such 
Magnitudes and operations on the very large databases. 
Conducting set-oriented database operations in a parallel DBMS 
architecture is an area that has shown increasing promise in 


solving this problem. 


A. PARALLEL ARCHITECTURE FOR DATABASE MANAGEMENT 

Conducting parallel operations in a supercomputer for 
increasing the speed of computations is not a new idea. There 
are numerous production-level, numerical-oriented 
supercomputers. However, this type of numerical supercomputers 
is not effective with the storage and retrieval of a very 


large database. 


An experimental supercomputer for database operations, 
known as the Multibackend Database Supercomputer (MBDS), has 
been developed to provide parallel database operations. This 
prototype system is a research vehicle located in the 
Laboratory for Database Systems Research at NPS and is 
utilized for the study of the design and performance of the 
parallel and scalable database supercomputer. 

The basic motivation of MBDS is to provide an architecture 
that spreads the work of DBMS among multiple backends 
(dedicated computers), each of which executes the same system 
software in parallel, thus drastically improving the DBMS 
performance. (Hsiao, 1983,p.302) 

MBDS is presently configured with eight parallel database 
processors (backends), each of which has three disk drives - 
a smaller one for paging programs, a small one for meta data, 
and a larger one for the base data of the database. The 
architecture of MBDS is illustrated in Figure 1. 

MBDS provides the necessary conditions for database 
Management performance gains and capacity growth through 
parallel database management operations. (Hsiao, 1991) 

MBDS is considered in two major sections, the controller 
section and the backend section. A discussion of the software 


for each follows. 
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FIGURE 1. THE MULTIBACKEND DATABASE S ‘PERCOMPUTER (MBDS) 


B. THE SOFTWARE OF MBDS 
1. The Controller-Computer Software 

The Controller consists of the following five main 
processes: Request (or Transaction) Processing (TP), Post 
Processing (PP), Insert-Information-Generator (IIG), PUT, and 
GET. TP interfaces with the user software of the system, 
identifies each user request, pre-processes the request and 
broadcast the pre-processed request to all the backends. 
Each backend computer in turn places the broadcasted request 
in it’s request queue. 

PP also interfaces with the user software. It performs 
post-processing on the database-transaction results and 
provides the results to the user. PP interfaces with TP which 
allows it to properly identify the intended user. 

IIG controls the insertion of records onto a backend’s 
database store and is responsible for the even distribution of 
each record cluster into the database stores of the backends. 
IIG maintains the space utilization table which provides the 
disk track information required to maintain an even 
distribution of records clustered. The space utilization table 


keeps the following information up-to-date for each cluster of 


records: 
(1) Identifies the backend whose database store contains 
the first trackfull of records of the cluster. 
(2) Identifies the backend whose database store contains 


the last trackfull of records of the cluster. 


(3) Identifies the backend whose database store can 
provide the first available trackfull of storage for 
inserting new records of the cluster. 

Finally, PUT and GET provide the communications link 
among computers, i.e., the controller and backends. PUT places 
messages on the LAN for transmission to other computers via 
either the one-to-one or the broadcasting mode. GET receives 
messages from the other computers via the LAN. 

2. The Backend-Computer Software 

In a backend computer, there are five processes that 
control all the backend operations. They are Directory 
Management (DM), Record Processing (RP), Concurrency Control 
(CC), GET, and™”PuT. 

The Directory Management process handles the managing 
of meta-data. Meta-data is stored information about the base 
data. Collectively, the three meta-data constructs form the 
directory of the database. They are attributes, descriptors, 
and clusters. An attribute is used to represent a category or 
certain common property of the base data, e.g., POPULATION. A 
descriptor is used to describe an unique value or a range of 
values that an attribute can have. For example, (1000 < 
POPULATION < 15000) is a possible descriptor for the attribute 
POPULATION. The descriptors that are defined for an attribute, 


e.g., population ranges, are mutually exclusive in terms of 


their values. A cluster is a group of records such that every 
record in the cluster satisfies the same set of descriptors. 

The condition that the descriptors defined for a given 
attribute have mutually exclusive attribute values is an 
important one. Mathematically, tthe descriptors of the 
attribute serve to derive equivalence classes which 
effectively partition the database into mutually exclusive 
sets of records (clusters). These clusters allow for an even 
distribution of a database onto the backend stores of MBDS. 

The Record Processing (RP) process is responsible for 
Managing the base-data of the database. Specifically, RP 
conducts record retrieval and selection. 

Concurrency Control (GC) is responsible £or 
Maintaining meta-data and base-data integrity during execution 
of a user request or transaction. Since the data requirements 
of a user request may overlap, it is important that data 
consistency is maintained while request are being processed. 

Each backend has a pair oof processes’ for 
communications, the GET process for getting transactions or 
messages from LAN and the PUT process for placing responses or 
messages on LAN. DM, RP, CC, GET, and PUT are the only five 
processes of a backend. These five processes are replicated in 
every backend and are supported by an Unix operating system 
with TCP/IP protocols. Figure 2 illustrates the relationship 


of the controller processes and the backend processes. 


From the frontend computer To the frontend computer 


The Controller Computer = 





The Meta-Data Store The Base-Data Store 
(disks of the (disks of the 
backend computer) backend computer) 


Figure 2. The Organization of MBDS Processes 


C. THE FIVE PRIMARY OPERATIONS OF MBDS 

The five primary operations of MBDS are Insert, Retrieve, 
Delete, Update, and Retrieve-Common. Insert operates on a 
single record at a time while the other four operate on a set 
of records at a time. 

A necessary operation for all databases is the Update 
operation. The Update operation of MBDS is very complex due to 


the following: 


(1) Update is a multiple-stage operation, i.e., each 
update must first stage the data into database 
processors from the database stores for processing, 
then perform necessary updates of values, delete the 
original data and return the newly updated data back 
to database stores. This 4-stage operation must be 
conducted by all the database processors and their 
corresponding database stores in parallel. 
Considerations must be made for coordinations 
among the parallel processors, buffering require- 
ments between multi-stage data movements and loads 
to individual processors. 


(23 Handling of an Update query becomes complicated 
because: 


(a) Clusters and records which may satisfy the 
query of an update must be locked with read/ 
write-deny locks because any use of these 
clusters and records before the update 
operation is complete may generate erroneous 
results. 


(b) Clusters and records which are being updated 
May become new clusters and new records; thus, 
we must now handle the deletion of old clusters 
and records and the creation of new clusters 
and records. 


D. THE GOAL OF THE THESIS 

The current implementation of Update does not work well. 
It can update small database stores only. The Update operation 
fails when tested on larger database stores. The cause of this 
failure is unknown. 

The goal of this thesis is to determine the defect(s) in 
the design and implementation of the Update operation. We must 
find a theory as to why the original Update does not work and 
then verify this theory through tests and analyses. Next,there 
is a need for a theory on how to correct the defective Update 
operation. Time permitting, we may implement the proposed 


corrective measures. 


E. THE ORGANIZATION OF THE THESIS 

The thesis is organized into three chapters in addition to 
this introduction. In Chapter II, we describe the Update 
operation logically in the context of the Multibackend 
Database Supercomputer. In addition, we present theories on 
why the current Update operation is defective. In Chapter III, 
we propose new design and implementation in order to overcome 
defects of the present Update operation. In Chapter IV, we 
summarize our findings and indicate unfinished work which will 


require others to carry out. 


1 THE UPDATE OPERATION WITHIN MBDS 


Chapter II presents a logical description of the UPDATE 
operation within the context of the MBDS environment and 


discusses theories on why UPDATE does not perform well. 


A. BACKGROUND INFORMATION FOR UNDERSTANDING THE UPDATE 
ALGORITHM 
1. The Base Data Organization 
As a database computer, MBDS must have a data model to 
characterize its database and to allow the user to refer to 
the database in terms of its logical properties. The data 
model used for MBDS is the attribute-based data model (ABDM). 
Every piece of data in the database is characterized 
in ABDM as an attribute-value pair. An attribute-value pair is 
a member of the Cartesian product of the attribute set and the 
value domain of the attribute. As an example, <POPULATION, 
30000> is an attribute-value pair having 30000 as the value 
for the POPULATION attribute. A record contains at most one 
attribute-value pair for each attribute defined in the 
database. Certain attribute-value pairs of a record are called 
the directory keywords of the record, because either the 
attribute-value pairs or their ranges are kept in a directory 
for indentifying records. Those attribute-value pairs which 


are not kept in the directory are called non-directory 
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keywords. The rest of the record is textual information which 
is referred to as the record body. An example of a record is 
shown below: 
(<FILE, USCensus>, <CITY,Monterey>, <POPULATION,30000>, 
{Temperate, Climate}) 
The angle brackets, <,>, enclose an attribute-value pair. The 
curly brackets, 1 he enclose the record body and the entire 
record is enclosed within the parenthesis. All the records of 
the database comprise its base data. Realistically, there are 
thousands, or even millions, of base data or records, in the 
database. 

The records of the database are identified by keyword 
predicates. A keyword predicate is a tuple consisting of a 
directory attribute, a relational operator (=,<,>...), and an 
attribute value. An example of a keyword predicate would be 
POPULATION < 30000. Keyword predicates, combined in 
disjunctive normal form, comprise a query of the database. 
The query 

(FILE = USCensus and CITY = Monterey) 
will be satisfied by all records of the USCensus file with the 
City of Monterey. 
2. The Meta Data Structure 

To manage the database, MBDS uses meta data which are 

organized into three tables: attribute table (AT), descriptor- 


to-descriptor-id table (DDIT), and the cluster-definition 


a: 


table (CDT), examples of which are given in Figure 3. AT maps 
(directory) attributes of AT to the descriptors defined on 
them. DDIT maps each descriptor to a unique descriptor id. CDT 
maps descriptor-id sets to cluster ids. Each entry consists of 
a unique cluster id, a set of descriptor ids whose descriptors 
define the cluster, and ids of the records that are in the 
cluster. Thus, to access the user data, MBDS must first access 
meta data via the AT, DDIT, and CDT. 
3. The Distributon of Meta and Base Data On Database 
Stores 
The distribution of meta data and base data on their 
separate database stores takes place differently, although 
both types provide for parallel accessing of data. A 
description of their differences is provided in the following 
sections. 
a. The Meta-data Distribution 
Meta data are usually one or two orders of 
magnitude smaller in size than base data. Due to the 
relatively small size of meta data, the designers of MBDS 
decided to replicate the meta data onto each backend’s 
database store. Consequently, all the backends can access 
their own meta-data stores and identical sets of attribute, 
descriptor, and cluster tables, im parallel 


(Hsiao,1991). 


a2 
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b. The Base-data Distribution 


Base data comprises the bulk of a database. 


Therefore, they are not replicated for storage. Also, they are 


stored on each backend’s own high-capacity disk drives using 


the following distribution algorithm: 


(1) 


(2) 


(3) 


(4) 


(5) 


(6) 


(7) 


From the Cluster table, the controller picks up a 
cluster identifier and its associated records; 


The controller blocks a variable-size cluster into 
fixed-size trackfuls of records; 


The controller determines the identifiers of the 
back ends, each of which can provide a track of 
available storage for the cluster identified (recall 
that the controller IIG has a storage utilization 
map to keep track of such information) ; 


The controller sends in parallel all the trackfuls of 
records to the backends identified; 


Each identified backend places its block of one or 
more trackfuls of clustered records into its base- 
data store and enters identifiers of records stored 
onto the replicated CDT entry corresponding to the 
cluster on the meta-data store. 


The controller then updates its space utilization map 
with respect to this cluster; and 


The entire procedure is repeated for all subsequent 
clusters. 


This one-track-per-backend database distribution 


algorithm evenly distributes records of a cluster over a set 


of separate, parallel database stores. Subsequent accesses to 


the records of a cluster can now be processed in parallel. 


14 


B. THE UPDATE OPERATION ON MBDS 

An UPDATE request is used to modify records of the 
database. The format of an UPDATE request consists of two 
parts, the query and the modifier. The query specifies which 
records of the database are to be modified. The modifier 
specifies how the records being modified are to be updated. 
For example, the following UPDATE request 

UPDATE (FILE= USCensus) (POPULATION = POPULATION + 5000) 
will modify all of the records of the USCensus file by 
increasing all populations by 5000. In this example, (FILE = 
USCensus) is the query and (POPULATION = POPULATION + 5000) is 
the modifier. 

1. The Update Algorithm 

(a) The Update request is broadcasted by the controller 
to all the backends. 

(b) The Directory Management process on each backend 
performs descriptor processing and address generation for the 
Update request. Descriptor processing consists of determining 
the descriptor ids of the descriptors that satisfy the 
keywords in the query. This set of descriptors, which 
Satisfies the query, is mapped to the Cluster-definition table 
(CDT) to determine the appropriate cluster id. Given the 
cluster id, the record ids are readily available in CDT. The 
set of selected record ids are passed to the address- 


generation function which determines the set of track 
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addresses in the secondary storage. These addresses permit 
accesses to the records required for the Update operation. The 
set of cluster ids are sent to Concurrency Control to be 
locked until completion of the Update operation. 

(C) Directory Management passes this set of track 
addresses to Record Processing for record retrieval and 
updating. Updating takes place as follows: 

(1) Fetch the entire set of tracks from the database 
store (secondary memory). Data are staged in the primary 
memory for quicker accesses during the record-modification 
phase of the operation. 

(2) Reserve a result buffer (updated records that 
change clusters are temporarily stored here prior to being 
sent to the controller for insertion into a new cluster). 

(3) For each address in the set of track addresses, 
fetch the track from the disk into the track buffer. 

(4) Examine the records in the track buffer one-by- 
one. If a record is marked for deletion, disregard it. If a 
record does not satisfy the query of the request, disregard 
it. If a record satisfies the query of the request, compute 
the new value according to the modifier and update the record 
in the track buffer. Send the old and new values of the 
updated record to DM to determine if the record has changed 
clusters. If the newly updated record changes cluster, then 
add the record to the result buffer and mark the old record 


for deletion in the track buffer. 
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(5) After examining all records in the track buffer, 
store the track buffer back to the disk. 

(6) Flush the result buffer and send the results to 
the IIG process in the Controller for insertion into a new 


cluster. 


C. THEORIES ON WHY THE UPDATE OPERATION IS NOT FULLY 
FUNCTIONAL WITHIN MBDS 
1. Theory One: Erroneous Hashing Techniques 
In the Update operation, each backend must first 
select a set of records to be operated on. Regardless of its 
size, this record set is retrieved from the database stores 
and stored in the virtual memory temporarily for subsequent 
operations. Hashing techniques are used to obtain the 
addresses for the temporary storage. The hashing technique 
used may be erroneous, causing nearly all of the records to be 
stored at the same virtual memory locations. 
2. Theory Two: Erroneous Allocations of Memory 
Recall that the current Update algorithm works for 
small size databases, e.g., 30 record database, but not for 
larger size databases, say, 500 records or more. Also, 
consider the fact that MBDS supports a concurrent environment 
(multiple users active on the system simultaneously). This 
concurrency capability forces a partitioning of the primary 
memory throughout the system. Limiting the available 


memory per user coupled with the fact all modifications for 
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concurrency capability forces a partitioning of the primary 
memory throughout the system. Limiting the available 
memory per user coupled with the fact all modifications for 
an UPDATE is done in the memory leads us to theorize that 
there may be a possible problem with memory 


allocation/availabiltiy. 
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et . AN ALTERNATIVE METHOD 
OF 
IMPLEMENTING THE UPDATE OPERATION 
In this chapter we focus on the theories presented in 
Chapter II. We provide evidence to substantiate the most 


promising theory and explain an alternative implementation 


method to overcome the problem with UPDATE. 


A. DISCUSSION OF THEORY ONE: ERRONEOUS HASHING TECHNIQUES 

The original design specifications for the UPDATE 
operation, as discussed in Chapter II, were never fully 
implemented. In particular, the staging of the entire set of 
tracks in the virtual memory via a hashing function was never 
implemented. Instead of retrieving the entire set of tracks to 
be updated during a single fetch, the UPDATE routine fetches 
one track at a time and puts it into the virtual memory. Then, 
performs all the required update steps on the records from 
this track and stores the newly updated track back to the base 
data disk. 

Therefore, our research efforts in this area led to the 
validation of theory one; and through test and analysis we 
discovered and fixed a logic error in the hashing function 
utilized by the RETRIEVE operation. 

The original hashing function utilized by the RETRIEVE 


operation was hashing all the retrieved records into the same 
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location in the virtual memory. Our modified version of the 
hashing function accomplishes the desired goal of a hashing 
function,i.e., an even distribution of data throughout the 


Memory area. 


B. DISCUSSION OF THEORY TWO: ERRONEOUS ALLOCATIONS OF MEMORY 

For theory two we generated several test databases of 
various sizes, and attempted to run the UPDATE routine on each 
to determine the point of operational failure. 

Initially, the UPDATE operation performed successfully on 
a database of 30 records. Every UPDATE query used throughout 
our test and analysis phase was written to update every record 
in the database. Therefore, a successful run on a database of 
30 records, implies 30 records were updated. Next, we tested 
the UPDATE operation on a database of 35 records and the 
entire system crashed. Through debugging we discovered the 
point of system failure. The system crashed while attempting 
to perform a write routine from the real memory on backend one 
to the paging disk on the same backend. Trouble-shooting of 
this problem led to the discovery of the following: 

Each backend has only 4 million bytes of the real memory. 
From this 4 megabytes there are only 1.8 megabytes available 
for conducting database operations such as UPDATE. Therefore, 
the code required to perform an UPDATE operation must be paged 
in and out of the real memory as required. Thus, one use of 


the paging disk is to partition the executable code of each 
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backend process into fixed size pages of 1024 bytes and 
provide a storage area for these pages. A second use of this 
paging disk is to provide a temporary storage area for a 
snapshot of the system. In particular, a temporary file is 
Maintained on the paging disk to allow a user to log on, 
conduct operations, and then log off in the middle of a 
session without losing his/hers current status of the system. 

Since this is a multiple user system, there exists the 
possibility of having several separate snapshot files residing 
on the paging disk simultaneously. 

The MBDS system crashed while conducting an UPDATE 
operation on 35 records because the paging disk became full 
with too many snapshot files left from prior sessions. 

A general cleanup of the paging disk, i.e., erasing of all 
the unnecessary snapshot files can free up enough space to 
allow a successful UPDATE operation on the 35 record database. 

Next we increased the size of the test database to 50 
records and performed a successful UPDATE operation. At this 
point we increased the size of the test database by increments 
of 50 records and ran the UPDATE operation. Each time the 
UPDATE operation ran successfully. 

Our final test database was 450 records. A successful 
UPDATE was performed on this database, thus convincing us that 
the problem was solved. The solution did not lie within the 
UPDATE algorithm as originally suspected, but within the 


underlying operating system environment. 
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Iv. CONCLUSIONS 
In this final chapter, we provide some concluding remarks. 
In the first part of the chapter, we furnish a summary of the 
thesis work. Next, we discuss the difficulties and problems 
encountered while completing the work for this thesis. 


Finally, we provide some recommendations for future efforts. 


A. THE SUMMARY 

Performance problems and upgrade issues have always been 
an obstacle in traditional mainframe-based, single-backend 
database systems. Never has this been more evident today as 
the new database application requires a database which is 
several orders of magnitude bigger than the largest database 
found in conventional applications. The Multi-Backend Database 
Supercomputer (MBDS) attempts to overcome this type of problem 
through specialization and parallelization of the database 
operations on multiple dedicated backend computers. 

A critical tool for any DBMS is the UPDATE operation. The 
UPDATE operation on MBDS was not fully functional and required 
further testing and analysis in order to determine its defect. 

In this thesis effort, we have first gained a thorough 
knowledge of the entire MBDS. Secondly, we have acquired a 
working knowledge of the C programming language in order to 


understand the MBDS operations. Then, we have developed 
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numerous test databases and conducted operational testing of 
the UPDATE routine to determine the point of any malfunction. 
Having understood the capabilities of the original UPDATE 
operation, we then dissect the UPDATE algorithm and look for 
a logic flaw. Once convinced that the algorithm has been 
sound, we focus our attention towards the underlying operating 
system which supports the UPDATE operation. In particular, we 
have looked at two possible problem areas: erroneous hashing 
techniques, and erroneous allocations of the memory. A 
thorough investigation of both has led us to the following 
corrections to MBDS: 

@® Our modified version of the hashing function is utilized 
to retrieve records into the virtual memory. It now 
supports an even distribution of records throughout the 
memory area. 

@ The UPDATE operation now performs flawlessly with large 


test databases, no longer crashing the entire system on 
very small databases. 


B. DIFFICULTIES ENCOUNTERED 

The size of MBDS and the UNIX operating system both 
contributed to a steep learning curve for students working on 
the MBDS system. The amount of information initially required 
by a student to work on MBDS is very large, and requires a 
substantial portion of the students alloted thesis time. 


Additionally, the student must become very proficient in the 
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C programming language in order to understand the MBDS 
implementation. 

Once the student has understood MBDS and the UNIX 
operating system and becomes proficient in C, the student must 
develop a theory or theories of possible defects in UPDATE. 


This has not been a trivial task. 


C. RECOMMENDATIONS FOR FUTURE EFFORTS 
The original design specifications for the UPDATE 

operation allow for the modifier of an UPDATE query to be one 
of five types: 

@® Type 0: <attribute = constant> 

@® Type I: <attribute = f(attribute) > 

® Type II: <attribute = f(attributel) > 

® Type III:<attribute = f(attributel) of Query> 


® Type IV: <attribute = f(attributel) of Pointer> 


Let a record whose attribute is being modified be referred 
to as the record being modified. Then a type-0 modifier sets 
the new value of the attribute being modified to constant. A 
type-I modifier sets the new value of the new attribute being 
modified to be some function of its old value in the record 
being modified. A type-II modifier sets the new value of the 
attribute being modified to be some function of some other 
attribute value in the record being modified. A type-III 


modifier sets the new value of the attribute being modified to 
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be some function of some other attribute value in another 
record uniquely identified by the query in the modifier. 
Finally, a type-IV modifier sets the new value of the 
attribute being modifier to be some function of some other 
attribute value in another record identified by the pointer. 

Currently, only the type-0 modifier is implemented within 
the UPDATE operation. Thus, future work on the UPDATE 
operation will be to implement the other four types of 


modifiers for an UPDATE query. 
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